An Adaptive-Learning-Based Generative Adversarial Network for One-to-One Voice Conversion
Abstract
Voice conversion (VC) has emerged as a significant domain of research in the field of speech synthesis in recent years due to its emerging application in voice-assistive technologies, such as automated movie dubbing and speech-to-singing conversion, to name a few. VC deals with converting the vocal style of one speaker to another while keeping the linguistic contents unchanged. Nowadays, generative adversarial network (GAN) models are widely used for speech feature mapping from the source to the target speaker. In this article, we propose an adaptive-learning-based GAN model, called ALGAN-VC, to improve one-to-one VC of speakers. Our ALGAN-VC framework consists of some approaches to improve the speech quality and voice similarity between the source and target speakers. We incorporate a dense residual architecture into the generator for efficient speech feature learning. The model also includes an adaptive learning mechanism to compute the loss function for the proposed model. Moreover, a boosted learning rate approach is incorporated to enhance the learning capability of the proposed model. The model is tested on the Voice Conversion Challenge 2016, 2018, and 2020 datasets along with our self-prepared Indian regional-language-based speech dataset. In addition, an emotional speech dataset is considered for evaluating the model’s performance. The objective and subjective evaluations of the generated speech samples indicated that the proposed model elegantly performed the voice conversion task by achieving high speaker similarity and good speech quality.
Similar resources
Generative Adversarial Residual Pairwise Networks for One Shot Learning
Deep neural networks achieve unprecedented performance levels over many tasks and scale well with large quantities of data, but performance in the low-data regime and tasks like one shot learning still lags behind. While recent work suggests many hypotheses from better optimization to more complicated network structures, in this work we hypothesize that having a learnable and more expressive si...
Metric Learning-based Generative Adversarial Network
Generative Adversarial Networks (GANs), as a framework for estimating generative models via an adversarial process, have attracted huge attention and have proven to be powerful in a variety of tasks. However, training GANs is well known for being delicate and unstable, partially caused by its sigmoid cross entropy loss function for the discriminator. To overcome such a problem, many researchers...
Energy-based Generative Adversarial Network
We introduce the “Energy-based Generative Adversarial Network” model (EBGAN) which views the discriminator as an energy function that associates low energies with the regions near the data manifold and higher energies with other regions. Similar to the probabilistic GANs, a generator is trained to produce contrastive samples with minimal energies, while the discriminator is trained to assign hi...
Adaptive voice-quality control based on one-to-many eigenvoice conversion
This paper presents adaptive voice-quality control methods based on one-to-many eigenvoice conversion. To intuitively control the converted voice quality by manipulating a small number of control parameters, a multiple regression Gaussian mixture model (MR-GMM) has been proposed. The MR-GMM also allows us to estimate the optimum control parameters if target speech samples are available. However...
SVSGAN: Singing Voice Separation via Generative Adversarial Network
Separating two sources from an audio mixture is an important task with many applications. It is a challenging problem since only one signal channel is available for analysis. In this paper, we propose a novel framework for singing voice separation using the generative adversarial network (GAN) with a time-frequency masking function. The mixture spectra is considered to be a distribution and is ...
Journal
Journal title: IEEE Transactions on Artificial Intelligence
Year: 2023
ISSN: 2691-4581
DOI: https://doi.org/10.1109/tai.2022.3149858